ProvDS: Uncertain Provenance Management over Incomplete Linked Data Streams
Abstract
Data processing in distributed environments often spans heterogeneous systems, which creates the need to exchange provenance information, such as how and when data was generated, combined, recombined, and processed. Distributed systems involve multiple participants and data sources, which can produce unreliable or erroneous data. In addition, there may be vast amounts of data to deal with, e.g., in fields such as the Internet of Things (IoT) and Smart Cities. Dynamic, stream-based data processing mechanisms are therefore better suited to these environments. We propose provenance- and recovery-aware data management techniques that take dynamic, incomplete streams as inputs and simultaneously recover the missing data and compute the provenance over the reconstructed streams. Unlike traditional provenance management techniques, which are applied to complete and static data, our research focuses on dynamic and incomplete heterogeneous data.
Similar resources
Provenance Aware Linked Sensor Data
Provenance, from the French word “provenir”, describes the lineage or history of a data entity. Provenance is critical information in the sensors domain to identify a sensor and analyze the observation data over time and geographical space. In this paper, we present a framework to model and query the provenance information associated with the sensor data exposed as part of the Web of Data using...
Provenance and Uncertainty
Sudeepa Roy, Susan B. Davidson, Sanjeev Khanna. Data provenance, a record of the origin and transformation of data, explains how output data is derived from input data. This dissertation focuses on exploring the connection between provenance and uncertainty in two main directions: (1) how a succinct representation of provenance can help infer uncertainty in the input or ...
Uncertainty over Structured and Intensional Data
The World Wide Web contains vast quantities of information of a heterogeneous nature available to automated agents: trillions of Web pages, hundreds of millions of social messages per day on websites such as Twitter, hundreds of knowledge bases in the open linked data cloud with tens of billions of semantic facts, some of them implying more facts through semantic reasoning rules. Traditional...
Mining Frequent Patterns in Uncertain and Relational Data Streams using the Landmark Windows
Today, in many modern applications, we search for frequent and repeating patterns in the analyzed data sets. In this search, we look for patterns that frequently appear in the data set and mark them as frequent patterns to enable users to make decisions based on these discoveries. Most algorithms presented in the context of data stream mining and frequent pattern detection work either on uncertai...
Queueing Analysis of Continuous Queries for Uncertain Data Streams Over Sliding Windows
With the rapid development of data collection methods and their practical applications, the management of uncertain data streams has drawn wide attention in both academia and industry. System capacity planning and Quality of Service (QoS) metrics are two very important problems for data stream management systems (DSMSs) to process streams efficiently due to unpredictable input characteristics and...
Publication date: 2017